An Efficient Method for Making Un-Supervised Adaptation of HMM-based Speech Recognition Systems Robust Against Out-Of-Domain Data

Authors

Thomas Plötz

Gernot A. Fink

Abstract

Major aspects of cognitive science are based on natural language processing utilizing automatic speech recognition (ASR) systems in scenarios of human-computer interaction. To improve the accuracy of the related HMM-based ASR systems, efficient approaches for un-supervised adaptation represent the methodology of choice. The recognition accuracy of speaker-specific recognition systems derived by online acoustic adaptation directly depends on the quality of the adaptation data actually used. It drops significantly if sample data outside the scope (lexicon, acoustic conditions) of the original recognizer generating the necessary annotation is exploited without further analysis. In this paper we present an approach for fast and robust MLLR adaptation based on a rejection model which rapidly evaluates an alternative to existing confidence measures, so-called log-odd scores. These measures are computed as the ratio of scores obtained from acoustic model evaluation to those produced by some reasonable background model. By means of log-odd scores, threshold-based detection and rejection of improper adaptation samples, i.e. out-of-domain data, is realized. In experimental evaluations on two challenging tasks we demonstrate the effectiveness of the proposed approach.
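
The rejection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Utterance` fields, the per-frame normalization, the threshold value, and the example scores are all assumptions made for illustration. The sketch assumes both models return log-likelihoods, so the ratio of scores becomes a difference in the log domain.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    """Hypothetical container for one candidate adaptation sample."""
    name: str
    acoustic_loglik: float    # log-likelihood under the recognizer's acoustic model
    background_loglik: float  # log-likelihood under a background model
    num_frames: int           # number of feature frames in the sample


def log_odd_score(utt: Utterance) -> float:
    """Log of the acoustic/background likelihood ratio.

    Dividing by the frame count is an assumption of this sketch, made so
    that utterances of different lengths share one threshold.
    """
    if utt.num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (utt.acoustic_loglik - utt.background_loglik) / utt.num_frames


def select_adaptation_data(utts: List[Utterance], threshold: float) -> List[Utterance]:
    """Keep samples whose log-odd score reaches the threshold; reject the rest."""
    return [u for u in utts if log_odd_score(u) >= threshold]


# Made-up scores, chosen only to show one accepted and one rejected sample.
utterances = [
    Utterance("in_domain", acoustic_loglik=-4200.0, background_loglik=-4500.0, num_frames=100),
    Utterance("out_of_domain", acoustic_loglik=-5100.0, background_loglik=-5000.0, num_frames=100),
]
accepted = select_adaptation_data(utterances, threshold=0.5)
print([u.name for u in accepted])  # prints ['in_domain']
```

Only the accepted samples would then be passed on to the MLLR adaptation step. In practice the threshold would have to be tuned on held-out data.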

Similar resources

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...

Alert correlation and prediction using data mining and HMM

Intrusion Detection Systems (IDSs) are security tools widely used in computer networks. While they seem to be promising technologies, they pose some serious drawbacks: When utilized in large and high traffic networks, IDSs generate high volumes of low-level alerts which are hardly manageable. Accordingly, there emerged a recent track of security research, focused on alert correlation, which ext...

Fast Estimation of Warping Factors in Vocal Tract Length Normalization Using Scores Obtained from Gender Detection Modeling

The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...

Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable gap between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

Publication date: 2007